SELF-IMPROVEMENT OF TEACHERS THROUGH 
SELF-RATING: A NEW SCALE FOR RATING 
TEACHERS' EFFICIENCY 



H. O. RUGG 
Lincoln School of Teachers College 



The discussion of rating scales for measuring teachers' dynamic 
qualities is pertinent to two important administrative problems. 
The first is that of effective methods of training teachers in service. 
The second concerns the need for developing objective measures 
of teachers' efficiency for administrative purposes of marking and 
promotion. 

1. Need for making teachers critical of their work. — One of the 
most acute educational needs is that of training public-school 
teachers in service. As they enter the service teachers are typically 
untrained. Furthermore, tenure is short, the modal length of 
service being not more than two years. At least 175,000 teachers 
enter the service each year. Especially is this true in small towns 
and cities which receive the transients and which are really training 
centers for those who go on to the larger school systems. In large 
cities, there is a converse situation. There, long tenure is accom- 
panied by relatively few "training" interruptions. Automatic 
salary increases tend, when taken with long tenure and lack of 
stimulating training, to reduce teachers to routine "trade-workers." 
Such teachers rarely develop an attitude of constructive criticism 
of their work. 

2. Need for valid methods of judging teachers' efficiency. — Closely 
connected with this problem of training teachers in service, although 
the two are rarely tied together, is the problem of measuring 
teachers' efficiency. Nearly all school systems operate some kind 
of rating scheme. Teachers are commonly promoted on the basis 
of a rating given by principal, supervisor, or superintendent. 

Teachers' efficiency ratings are unreliable. — That these ratings 
are inadequate measures of efficiency is shown by reference to 
compilations of such ratings in various systems. Diagrams I and II 
illustrate the way in which efficiency ratings commonly distribute. 
These diagrams report the efficiency ratings made by principals 
and assistant superintendents in one of our largest school systems. 
Note that 4,400 out of 7,100 elementary school teachers are marked 
either superior or excellent. Thus, the curve is distinctly skewed 
toward the highest marks. In fact 96 per cent of all the teachers are 
either "superior," "excellent," or "good." The same situation is 
revealed from ratings given 893 high-school teachers, as shown in 
Diagram II. Such evidence can be duplicated for many cities, 
both large and small. For such a large number of teachers, there 
is much evidence to justify the conclusion that ratings which so 
distribute cannot be reliable measures of success in teaching. From 
intimate knowledge of this particular school system, the writer 
knows that to be true. The need is obvious, therefore, for devices 
by which the dynamic qualities in teachers can be adequately 
rated. 

Diagram I. — Distribution of efficiency ratings on 7,131 elementary teachers in 
one of our large city systems, 1917. 

Diagram II. — Distribution of efficiency ratings on 893 teachers in the high 
schools of one of our large city systems, 1917. 



Objective measurement has affected very slightly the administration 
of the teaching staff. — One's first thought in canvassing the rating 
of teaching efficiency is that objective methods of measuring should 
be applied to this field. The writer is one of those who would urge 
that, in the selection of teachers, one criterion of admittance should 
be intelligence. This can and should be measured by objective tests. 
With the rapid development of such scales, doubtless we shall soon 
have innovations in this direction. 

Complex dynamic qualities which cannot be measured by test. — 
Beyond the measurement of intelligence, however, the qualities in 
a teacher which are demanded for a successful contribution are 
complex traits — resultants of training operating on intelligence. 
In these traits emotional and physical elements both contribute, 
and in many instances to an even larger extent than do intellectual 
elements. In a parallel article¹ the writer has shown the importance 
of measuring dynamic qualities — with special reference to high-school 
students. His discussion shows that measurement of these 
qualities must necessarily consist in the standardization of methods 
of judging human traits, and not in the measurement of educa- 
tional products. If it were possible we would measure the effi- 
ciency of a teacher by determining the effect of his instruction on the 
all-round lives of his pupils. The product of such instruction, 
however, eventuates slowly. Outcomes are remote, not immediate, 
except as they are purely intellectual in character. The intellectual 
products of a teacher's instruction can and ought to be measured 
each semester and year. We are very clear that one element which 
contributes to the rating of a teacher should be the actual productive 
effort as shown by the achievement of his pupils at intervals and 
at the end of the school year. 

But these intellectual qualities form only a small portion of the 
totality of a teacher's equipment. They ignore, for example, quali- 
ties of "team work," "loyalty to the school," "co-operation" with 
other teachers, "qualities of growth and keeping up-to-date," var- 
ious "personal" and "social" qualities, etc. Furthermore, they 
only indirectly measure "skill in teaching" and "skill in the mechan- 
ics of managing a class." 

The measurement of such qualities involves standardization of the 
process of judging. One observes a teacher teach. One judges of his 
efficiency. One observes and evaluates the extent to which he 
reveals definite aims, knows the subject-matter of his field, selects 
it wisely, organizes the class discussion spontaneously, asks ques- 
tions skilfully, has insight into how children learn, etc. One 
observes the extent to which the class work proceeds smoothly, 
whether the pupils attend spontaneously, whether discipline inheres 
in the work, etc. Hence the need for standardized methods of 
"observing" a process. 

THE MOVEMENT FOR THE RATING OF TEACHERS' EFFICIENCY 

The first proposal that a definite rating schedule be used in 
observing and measuring the work of teachers was made by E. C. 
Elliott in 1910. Following his suggestion some half dozen different 
attempts have been made to develop forms which would aid the 
administrative officer in rating qualities. Boyce² conducted a 
detailed study, constructed a rating scheme, and experimented 
with it in many systems. The New York Bureau of Municipal 
Research gave wide publicity to a rating card which was so planned 
that definite questions were answered about the teachers. Beatty 
and C. H. Johnston experimented with a rating card at the University 
of Illinois which dealt only with classroom activities of teachers. 
A number of articles have appeared discussing the use of these 
scales. 

¹ School Review, XXVIII (May, 1920), 337-49. 

Table I summarizes the chief features of the various schemes. 
A study of this table enables one to evaluate the movement 
definitely and briefly as follows: 

Rating scales for teachers have generally been inclusive of all 
traits. There has been a tendency to ignore the difference between 
traits revealed in the classroom and those which deal with other 
phases of the teachers' activities. These traits have always been 
grouped in some way, however. In describing traits the tendency 
has been to use single words or phrases instead of detailed questions. 
Some of the workers have tried to weight the qualities of a teacher, 
others to assign him to a group, "excellent," "superior," "medium," 
etc. The latter movement seems to be gaining headway. All num- 
bers of groups, from 2 to 10, have been tried. Rarely has the rating 
been done by a real ranking or direct man-to-man comparison. 

Movement to rate teachers is at a standstill. — The movement cannot 
be said to have succeeded, however. The present writer believes 
it needs a new impetus and a new emphasis. In a number of large 
cities, and in some small cities, the movement has actually failed. 
Rating scales have been introduced, tried for a year or two, and 
then dropped as unsatisfactory. Nearly always they have been 
opposed by the teachers themselves. Frequently the principals 
and superintendents have been skeptical of their value. 

Three causes for the present inertia. — First, rating schemes are not 
aimed primarily at self-improvement. The basic reason for their 
failure has been the element of rating from above by an administrative 
officer. It is the viewpoint of the present writer that for a 
rating scale to be truly helpful, its chief element must be 
self-improvement through self-rating. Improvement of teachers in 
service rests directly upon the initial step of self-criticism. It is 
conceivable that this could be stimulated by the personal exhortation 
of the principal. It rarely is, however. It can be stimulated 
from within more helpfully and continuously, provided objective, 
impersonal schemes can be developed by which teachers can be 
made critically conscious of their strengths and weaknesses. Thus, 
rating schemes to the present time have revealed an important 
defect in that they were nearly always an administrative scheme 
superimposed from above. 

² A. C. Boyce, "Methods of Rating Teachers' Efficiency," Fourteenth Yearbook of the National Society 
for the Study of Education, Part II. 

Secondly, qualities have been vaguely described, unobjective, and 
indefinite. — A second striking defect is that the traits have been 
described in vague terms. The rating officer and the teacher have 
rarely understood each other clearly. The content of early rating 
scales has made use of single words or brief phrases. Teachers were 
to be rated on sympathy, tact, integrity, enthusiasm, adaptability, 
resourcefulness, sense of justice, loyalty, etc. Rarely have such 
schemes been made concrete enough so that two or more rating 
officers rating the work of the same teacher could visualize precisely 
the same group of qualities. 

Thirdly, classification of traits not clear — qualities overlap. — A 
third defect of rating forms has been the duplication and over- 
lapping in scope of many of the qualities, for example, self-control 
and tact, judicial-mindedness and sense of justice, etc. 

TWO FEATURES IN THE SCALE PROPOSED HEREWITH: 

I. Self-Improvement of Teachers Through Self-Rating 

The first purpose of this article is to suggest administrative 
devices by which self-improvement can be brought about through 
self-rating. Form A in the rating scale on pages 680 and 681 presents 
a definite suggestion in this direction. It consists of a classification 
of a teacher's qualities arranged in five groups: 

1. Skill in teaching 

2. Skill in the mechanics of managing a class 

3. Team work qualities 

4. Qualities of growth and keeping up-to-date 

5. Personal and social qualities. 
This scheme has been developed through months of planning 
and conference with more than 100 high-school teachers and 
school principals and superintendents. It is believed that this 
classification results in little or no overlapping of qualities. The 
scheme consists of a series of concrete questions, asking the teacher 
to rate himself on: "To what extent do I do thus and so?" Under 
"skill in teaching," one is asked to check himself in one of three 
groups, "low," "average," or "high," for example, on the extent to 
which he knows the subject-matter of his own and related fields; 
the extent to which he selects subject-matter effectively for class 
reading and discussion; the extent to which he is skilful in conducting 
class discussion; etc. Contrasted with the practice of using single 
words or brief phrases, we are resorting here to the scheme of asking 
concrete questions arranged in sentence form. The rating process is 
made practical, furthermore, by stimulating the teacher to rate 
himself in simply one of three groups. Previous schemes have sug- 
gested the rating of teachers in as many as ten groups, some of five, 
some of seven. The writer is convinced that it is much more 
helpful to use a more abbreviated scheme. 

The device³ is directed at both the teacher and principal. To 
the teacher we say: "Rate yourself on each quality on this form. 
It will be a first step in self-improvement. It is important that you 
stand high in these qualities." To the principal or superintendent 
we say: "Let the teacher rate himself on each question at least once 
each term. Self-analysis is the first step in self-improvement. 
To analyze human qualities well, one needs a definite and detailed 
guide. For effective teacher rating, both teacher and administrator 
should rate and confer on specific qualities which make for good 
teaching. A valuable file of the administrator's analyses of his 
teachers can be kept in the office." 

With such a scheme the teacher is rating himself on the same 
form on which he is being rated by his principal or superintendent. 
Conditions are set up by which there can be sympathetic understanding 
between the teacher and the principal concerning the work of 
the teacher. Misunderstandings will be avoided through a meeting 
of the two minds. Certainly no administrator should observe the 
work of a teacher, criticizing his work, without a thoroughgoing 
and constructive conference. Finally, a valuable file of the 
administrator's analyses of his teachers can be kept in the office by methods 
suggested above. 

³ Rating Scales for Judging Teachers in Service can be obtained from the University of Chicago 
Bookstore, 5802 Ellis Avenue, Chicago, Illinois. Price: In quantities under 200, 5 cents each; in quantities 
over 200, 4 cents each; samples, 10 cents each. 

II. Rating Teachers' Efficiency by Direct Comparison 

Form B of the rating scale suggests a type of rating scale which 
is new in the field of education. It is constructed along the general 
lines of the Army Rating Scale. The central idea is man-to-man 
comparison. A teacher is rated by comparing him on a group of 
qualities with a number of other teachers who have been selected 
very carefully to form the scale. It gives a method of assigning a 
teacher a numerical rating. The reliability of ratings on such a scale 
is known approximately. Three months of experimentation with 
the Army Rating Scale which was conducted by the present writer, 
enables us to predict closely the reliability of ratings made on this 
type of scale. Other ratings made on "equivalent" scales can be 
directly compared. Furthermore, it leads to a rating which cannot 
be confused with the commonly used percentage marking system 
of the public schools. 

The five groups of qualities on which the rating is done are the 
same as in Form A. In fact, in constructing the scale the signifi- 
cance of the terms "skill in teaching," "skill in mechanics of manag- 
ing a class," "team work qualities," etc., can be made clear by 
the careful reading of concrete questions listed under each corre- 
sponding heading in Form A. Note that the essence of man-to- 
man comparison is the selection of five teachers. They are so 
chosen as to represent respectively (1) the best teacher the rater 
has ever known; (5) the poorest teacher; (3) the representative 
average teacher; (2) the person midway between the best and the 
average; (4) the one midway between the average and poorest. 

A definite number of points is assigned to each of these five 
positions on the scale. For example, the best teacher one ever 
knew is assigned 38 points, the poorest 6, the others 30, 22, and 14, 
respectively. Thus a person is given a single numerical rating by 
totaling the points he is given on each of the five groups of quali- 
ties. In the case of teachers this is of especial importance in rating 
for promotion. 

The scale is so constructed that if a teacher represented "average" 
on each group of qualities, he would receive a score of 110. A 
teacher who represented "best" on each would receive a score of 
190, and the one who represented "poorest" would receive a score 
of 30. 
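
The arithmetic behind these three totals can be checked with a short sketch. The point values and the five groups of qualities are taken from the text; the names `SCALE_POINTS`, `QUALITY_GROUPS`, and `uniform_total` are illustrative only and do not come from the rating card.

```python
# Point values assigned to the five scale positions on Form B (from the text).
SCALE_POINTS = {
    "best": 38,
    "better than average": 30,
    "average": 22,
    "poorer than average": 14,
    "poorest": 6,
}

# The five groups of qualities used on both Form A and Form B.
QUALITY_GROUPS = (
    "skill in teaching",
    "skill in the mechanics of managing a class",
    "team work qualities",
    "qualities of growth and keeping up-to-date",
    "personal and social qualities",
)


def uniform_total(position: str) -> int:
    """Total score for a teacher placed at the same position in every group."""
    return SCALE_POINTS[position] * len(QUALITY_GROUPS)


if __name__ == "__main__":
    for position in SCALE_POINTS:
        print(f"{position:>20}: {uniform_total(position)}")
```

Because no weights are used, the total is simply the sum of five group scores, so the lowest and highest possible totals are 30 and 190.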

No weights are assigned to any one of the five groups of quali- 
ties. The writer experimented in great detail with the question of 
weighting with the Army Rating Scale. His conclusion was that the 
rank order of persons rated on such a scale is so closely the same on 
any weighted scheme as on an unweighted scheme that the latter 
is preferable because of the greater economy of manipulation. 

HOW TO CONSTRUCT A RATING SCALE 

FORM B 

A. Two important facts: 

1. It is very difficult to make a rating scale properly. A scale cannot be 
constructed in less than two or three hours. 

2. Once made, the scale needs but little modification from year to year. 

B. Necessary steps in the construction of a scale: 

There are three major steps in constructing a rating scale. 

First step. — Write down a list of 25-30 teachers ranging from the very best 
to the very poorest in your acquaintance, for each of whom you can answer the 
questions of Form A of the Rating Card. 

Important: The list must not contain less than 25 names. It must contain 
some very poor teachers, some very good teachers, and a considerable number of 
"average" teachers. 

Second step. — Arrange this list in rank order of merit from the "best" to the 
"poorest," separately for each of the five groups of qualities. 

Important: In ranking persons for one group of qualities (say "skill in 
teaching"), the other groups of qualities must be absolutely ignored. The arrangement 
of the list in rank order is the most difficult and important single step in the construc- 
tion of a rating scale. Experimentation has shown that to do it successfully, the 
most effective method is as follows: 

1. Locate each person in the list in one of three groups for each question in 
each group of qualities on Form A. A study of this checking should enable you 
to rearrange the teachers on your original list in from three to five groups. 

2. Next, therefore, group the teachers on the original list in at least three 
and, if possible, five groups — say put in one group the few who are markedly 
"best"; in another, the few who are clearly "poorest"; and the remainder in a 
third group who exhibit various degrees of "mediocrity." If the large mediocre 
group can next be separated into two or three groups, it will facilitate the next 
step, viz.: 

3. Rearrange the persons in each group so that they stand in exact rank 
order. This will be most difficult to do in connection with the "average" groups. 

4. Complete the final "rank order" arrangement of the entire list by 
comparing the teachers near the limits of the adjacent groups, e.g., further direct 
man-to-man comparison may result in interchanging individuals from, say, a 
"better than average" to an "average" group. 

When the original list is finally arranged in the rank order, you are ready 
for the 

Third step. — Select five persons to occupy the five positions on the scale in 
each group of qualities. Do this as follows: (1) Make a final decision as to which 
of the two or three persons in the "rank order" list is really the "best" one you 
ever knew and can now use for purposes of scale-comparison. (2) Select in the same 
way the "poorest" for the scale. (3) From the two or three who stand nearest 
the middle of the list, decide upon the best one to represent the "average" position 
on the scale. (4) and (5) Do the same with the ones to occupy the position half-way 
between the average and best, and half-way between average and poorest. 
These are called respectively "better than average" and "poorer than average." 

Important: Experimentation and experience in the army have shown that the 
scale can be made properly only by carrying through these major steps separately 
for each group of qualities on the scale. 

A Rating Scale for Judging Teachers in Service 

Name of Teacher ________  School ________  Subjects ________  Date of Rating ________ 

FORM A — For analyzing and rating the teacher's qualities — by the teacher himself and by the administrator. 

SELF-IMPROVEMENT THROUGH SELF-RATING 

(Rate by checking in one of three groups for each question) 

I. SKILL IN TEACHING 

To what extent: 

Does he know the subject matter of his own and related fields: 
    1. In subjects like history, geography, etc., does he make effective 
       use of material outside the text book 
    2. Does he relate lessons to material in other fields and use 
       illustrations outside his own subject (e.g., math. and science) 

Does he select subject matter effectively for class reading and discussion 

Are his aims of teaching clearly defined 
  Does he give evidence of having: 
    1. Formulated clearly his aims of teaching, as shown by his written 
       statement of aims and outcomes 
    2. Planned his lessons specifically to carry these out 
    3. Distinguished clearly between (a) "formal skill" (either in manual 
       or academic subjects), (b) "information" and (c) "problem 
       solving" as proper outcomes from his class work 
    4. Given pupils clear ideas of the purposes of lessons 

Is he skillful in conducting the class discussion 
  a. Resourcefulness in organizing a discussion and in "thinking on his feet" 
    1. Is he fertile and quick in taking advantage of pupils' questions 
    2. Are his questions systematically planned, yet spontaneously given 
    3. Does he express himself clearly 
  b. Skill in conducting "drill" exercises 
    1. Does he make use of economical, "timed," drill-devices (such 
       as Courtis' Practice Exercises, etc.) 
    2. Does he properly subordinate drill to clear exposition, that is, 
       keep a proper balance between drill and "development" 
  c. Ability to "develop" new phases of the work 
    1. Are lessons well related to previous ones 
    2. Is material "organized" 
    3. Do lessons show the use of material in the solution of present or 
       future problems: 
         a. In his subject 
         b. Outside his subject 
  d. Ability to secure class participation in the recitation 
    1. Do all pupils in the class take part in the discussion 
    2. Do the pupils question each other and conduct the class 
       independently of his formal direction 
  e. Skill in making the assignment 
    1. Was it an attempt to teach pupils how to study the lesson 
    2. Was it more than mere formal announcement of the number of 
       pages in the text, etc. 
    3. Are its scope and purpose clearly recognized by pupils 

Has he insight into "how children learn" 
    1. Does he keep the discussion within the pupils' comprehension 
    2. Does he endeavor to discover pupils' difficulties by keeping records 
       of errors and studying these 
    3. Does he adapt discussion to individual differences in pupils 

Summary Rating on Skill in Teaching ________ 

II. Skill in the Mechanics of Managing a Class 

To what extent: 
    1. Does the class work proceed smoothly (without artificial 
       interruptions and transitions from one kind of discussion to another) 
    2. Do the pupils attend naturally and spontaneously to the work 
       of the lesson 
    3. Does order or discipline inhere in the work (not maintained by 
       compulsion or suppression) 
    4. Is routine, as passing material, moving to the blackboard, etc., 
       economically and systematically organized 
    5. Are material and equipment in the room effectively arranged 
    6. Does he pay attention to the details of heat, light and ventilation 

Summary Rating ________ 

III. Team Work Qualities 

To what extent: 
    1. Does he co-operate with other teachers in school activities 
       (committee work, Parent-Teacher Association, etc.) 
    2. Does he contribute to faculty [...] 
    3. Is he loyal to the administration and to other teachers 
    4. [...] group improvement of the school 
    5. Does he shoulder responsibility for his own acts 
    6. Do pupils go to him voluntarily for advice and conference 
    7. Does he go out of his way to advise and help students 
    8. Does he acquaint himself with pupils' home conditions where [...] 
    9. Does he participate in community activities outside the school 
    10. Are his records and reports in on time and in complete form 

Summary Rating ________ 

IV. Qualities of Growth and Keeping Up-to-date 

To what extent: 
    1. Does he read professional literature, books, journals, etc. 
    2. Does he participate in and contribute to the discussion 
       of educational meetings (teachers' association, etc.) 
    3. Does he take extension courses, attend summer sessions [...] 
    4. Does he experiment with new methods in teaching which 
       others have suggested 
    5. Does he invent and experiment with new methods of teaching 
    6. Does he heartily co-operate in investigational work in 
       which various schools participate 
    7. Does he participate on committees of associations in his [...] 
    8. Does he contribute to educational literature 

Summary Rating ________ 

V. Personal and Social Qualities 

To what extent: 
    1. Does he attract people to him (i.e., is he interested 
       primarily in what others are doing) 
    2. Does he meet people easily 
    3. Does he recognize the importance of trimness in 
       dress and general personal [...] 
    4. Is he "fine-grained" (i.e., is he sensitive to social proprieties) 
    5. Does his impression of his own ability operate to 
       handicap his effectiveness 
    6. Is he effectively aggressive in conversation and conference 
    7. Is he tactful in dealing with pupils, colleagues and patrons 
    8. Does he "eventuate," i.e., does he carry through 
       projects which he starts 

Summary Rating ________ 

TO THE TEACHER — Rate yourself on each quality on this form. It will be a first step in 
self-improvement. It is important that you stand high in these qualities. 

TO THE PRINCIPAL OR SUPERINTENDENT — Let the teacher rate himself on each question at least 
once each term. Self-analysis is the first step in self-improvement. To analyze human qualities 
well, one needs a definite and detailed guide. For effective teacher rating, both teacher 
and administrator should rate and confer on specific qualities which make for good teaching. 
A valuable file of the administrator's analyses of his teachers can be kept in the office. 

FORM B 

RATING BY DIRECT COMPARISON 

The Rating Scale: Containing the names of types [of] teachers who can be compared with the 
teacher to be rated. 
(Primarily for Principals and Superintendents in the Rating of Teachers) 

The same scale, with a blank for the name of the teacher chosen for each position, 
is printed under each of the five groups of qualities: 

    I. Skill in Teaching 
    II. Skill in the Mechanics of Managing a Class 
    III. Team Work Qualities 
    IV. Qualities of Growth and Keeping Up-to-Date 
    V. Personal and Social Qualities 

    Scale position           Name        Points 
    Best Teacher             ________    38 
    Better than Average      ________    30 
    Average                  ________    22 
    Poorer than Average      ________    14 
    Poorest Teacher          ________    6 

    Summary Numerical Rating (one for each group) ________ 

    Total Numerical Rating ________ 
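
The third step of scale construction, choosing the five scale persons from a rank-ordered list, can be sketched as follows. This is only an illustration under stated assumptions: the function `candidates` and its `width` parameter are hypothetical, and taking "half-way between the average and best" to mean the quarter point of the rank order is a simplification that the text does not prescribe. The sketch merely shortlists the "two or three" names nearest each position; the final choice remains a matter of the rater's judgment, and the procedure would be repeated separately for each group of qualities.

```python
from typing import Dict, List, Sequence

POSITIONS = (
    "best",
    "better than average",
    "average",
    "poorer than average",
    "poorest",
)


def candidates(ranked: Sequence[str], width: int = 3) -> Dict[str, List[str]]:
    """Shortlist up to `width` names for each of the five scale positions.

    `ranked` is the list for ONE group of qualities, ordered best to poorest.
    """
    n = len(ranked)
    if n < 25:
        # The instructions require a list of at least 25 names.
        raise ValueError("the list must contain at least 25 names")
    if not 1 <= width <= n:
        raise ValueError("width must be between 1 and the list length")

    shortlist: Dict[str, List[str]] = {}
    for k, position in enumerate(POSITIONS):
        # Assumed target ranks: top, quarter point, middle,
        # three-quarter point, and bottom of the ranked list.
        centre = round(k * (n - 1) / 4)
        # Keep the window of `width` names inside the list.
        start = min(max(centre - width // 2, 0), n - width)
        shortlist[position] = list(ranked[start:start + width])
    return shortlist


if __name__ == "__main__":
    # Placeholder names standing in for a rater's own ranked list.
    ranked_list = [f"Teacher {i:02d}" for i in range(1, 26)]
    for position, names in candidates(ranked_list).items():
        print(f"{position:>20}: {', '.join(names)}")
```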

HOW TO RATE TEACHERS ON THE SCALE 

The rating is to be made for one group of qualities at a time, giving each 
person a stated number of points for that quality. It is done by comparing the 
person's qualities directly with those of the others whose names appear on the 
scale. Visualize each one as vividly as possible, thus locating a person at a par- 
ticular point on the scale. Be sure to give him the exact number of points that 
you think represents his position on the scale. The numerical values, 38, 30, 
22, 14, and 6 have been selected to give you considerable opportunity to assign 
values between these set points. For example, in the long run nearly as many 
should receive 23 or 21 as 22, which is the "average" point of the scale. 

Important: In case you are unable to decide clearly between the person you 
are rating and those whose names are on the scale, examine the results of checking 
the definitional questions in that group of qualities in Form A. This will enable 
you to compare them more concretely. 

The total rating of a person is obtained by adding the number of points given 
him on each of the five groups of qualities, writing this in the compartment of 
the card left for the total rating at the lower right-hand corner. 
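
A minimal sketch of this totaling step is given here. It assumes that points in each group are whole numbers between 6 and 38, which follows from the scale positions and the stated range of 30 to 190 for the total; the function names `nearest_position` and `total_rating` are illustrative and do not appear on the card.

```python
from typing import Mapping

# Points printed on the scale for each position (from the text).
SCALE = {
    38: "best",
    30: "better than average",
    22: "average",
    14: "poorer than average",
    6: "poorest",
}


def nearest_position(points: int) -> str:
    """Name the printed scale position closest to an assigned value.

    Values between the set points (e.g. 21 or 23) are allowed. An exact
    tie between two positions is resolved toward the higher one, because
    `min` keeps the first of several equally close anchors.
    """
    if not 6 <= points <= 38:
        raise ValueError("points in one group must lie between 6 and 38")
    anchor = min(SCALE, key=lambda value: abs(value - points))
    return SCALE[anchor]


def total_rating(points_by_group: Mapping[str, int]) -> int:
    """Add the points given on each of the five groups of qualities."""
    if len(points_by_group) != 5:
        raise ValueError("exactly five groups of qualities must be rated")
    for points in points_by_group.values():
        nearest_position(points)  # raises if a value is out of range
    return sum(points_by_group.values())


if __name__ == "__main__":
    # Hypothetical ratings for one teacher, for illustration only.
    card = {
        "skill in teaching": 30,
        "skill in the mechanics of managing a class": 23,
        "team work qualities": 22,
        "qualities of growth and keeping up-to-date": 14,
        "personal and social qualities": 38,
    }
    print(total_rating(card))
```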

Important: Scales for the rating of teachers should be made in group con- 
ference. All supervisors who rate the same set of teachers should come together 
and construct scales which will contain relatively the same names. It is fun- 
damentally important that names assigned to particular scale-positions ("best," 
"average," "poorest," etc.) be the same on the scales of different supervisors, 
otherwise the numerical ratings made against these scale-positions by various 
supervisors may not closely agree. If supervisors, in conference, can agree on the 
names of teachers to go on the scales, numerical ratings made against these will 
have very great reliability. 

The reliability of this scale. — It is perfectly clear that one of the 
most difficult tasks before administrators today is that of rating 
human character. At the same time the importance of successfully 
doing it is obvious. A question that will arise concerning the 
discussion of this scale, therefore, is that of its reliability. 

The unreliability of current typical ratings of teachers by 
principals is so great that the ratings are almost valueless. For three months 
in 1918, the writer carried on an experiment with the Army Rating 
Scale, both in Washington, and in a number of camps. For com- 
pleteness, for conclusiveness of evidence secured, and for objectivity 
of conditions, doubtless it will never be possible to duplicate the 
remarkable conditions basic to this investigation. Due to the un- 
usual circumstances of the Great War, we were able to set up condi- 
tions of amassing trained and well-educated men in groups for 
relatively long periods, through which they could become intimately 
acquainted with each other's traits. For the purpose of determining 
the validity of the Army Rating Scale, the War Department brought 
large groups of these men together for three-day conferences. 
Scales were constructed in conference. Officers rated each other 
under the most carefully controlled conditions of judging human 
character which we have yet been able to set up. This material 
was worked up statistically and interpreted carefully. No expense 
was spared to get at all of the statistical and psychological facts 
involved. A complete report was made upon this procedure to the 
War Department in December, 1918. The report establishes clearly, 
even for the experimental conditions under which we were then 
working, the very great difficulty which the person faces who 
attempts to rate the intangible qualities of a fellow-worker or 
subordinate. 

From this experiment it can be predicted that each ordinary 
efficiency rating on a teacher made by current methods and on a 
scale which would run from 100 to 0 will have a probable error as 
large as 10 to 15 points. This means that if an administrator 
merely wishes to divide his teachers into five groups, "superior," 
"excellent," "good," "medium," and "poor," the chances are very 
remote that the teacher will be assigned to the proper group. With 
the experimental ratings of the army investigation, we were able 
to reduce the probable error on a man-to-man comparison scale 
such as is reported herewith to about 5 points on a 100-unit scale. 
On the new scale reported herewith it is between 9 and 10 points on 
a 190-unit scale. 
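
What these probable errors imply can be illustrated with a small calculation. The sketch rests on assumptions not stated in the article: that rating errors are normally distributed (so that the probable error equals 0.6745 standard deviations), that five groups on a 100-point scale are equal bands 20 points wide, and that the teacher's true standing lies at the middle of a band. The function names are illustrative.

```python
import math

# For normally distributed errors, half of all errors fall within one
# probable error, which corresponds to 0.6745 standard deviations.
PE_TO_SIGMA = 0.6745


def chance_within(half_width: float, probable_error: float) -> float:
    """Chance that a single rating errs by less than `half_width` points."""
    sigma = probable_error / PE_TO_SIGMA
    return math.erf(half_width / (sigma * math.sqrt(2)))


def relative_error(probable_error: float, scale_units: float) -> float:
    """Probable error expressed as a fraction of the length of the scale."""
    return probable_error / scale_units


if __name__ == "__main__":
    # Five equal groups on a 100-point scale: each band is 20 points wide,
    # so a teacher at mid-band is misplaced once the error exceeds 10.
    for pe in (15, 10, 5):
        print(f"probable error {pe:>2}: "
              f"chance of correct group = {chance_within(10, pe):.2f}")

    # The army scale and the new scale compared relative to their length.
    print(f"army scale: {relative_error(5, 100):.3f}")
    print(f"new scale:  {relative_error(9, 190):.3f} "
          f"to {relative_error(10, 190):.3f}")
```

Under these assumptions a probable error of 10 points gives about an even chance of placing a mid-band teacher correctly, and a teacher nearer the edge of a band fares worse. Expressed relative to scale length, 9 to 10 points on 190 units is roughly the same as 5 points on 100 units.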

Even this degree of reliability is somewhat unsatisfactory. — There 
is not a large enough degree of probability that a teacher will be 
properly located in the right group. How can the reliability be 
still further increased? (1) Objectify the procedure of rating. This 
has been done in large part by a new method of constructing the 
scale, which was reported in the preceding paragraph. It is abso- 
lutely important that each step itemized above be followed care- 
fully. It is laborious; it cannot be done in less than several hours. 
To construct a scale properly, one ought to spend from eight to ten 
hours. Once the scale is made, however, it is almost permanent, 
and needs to be changed but slightly from time to time. (2) 
Increase the number of independent ratings. The surer way of increas- 
ing the reliability of the scale, however, is that of increasing the 
number of ratings. Although the probable error of a single rating 
on such a scale as this ranging from 30 to 190, will be very likely 9 
or 10 points, the probable error of three independent ratings will 
be between 5 and 6 points. This assures great probability that a 
teacher will be located in his proper group. This reliability is great 
enough. 
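
The figures for three ratings agree with the usual rule that the probable error of an average of n independent ratings is the probable error of a single rating divided by the square root of n. The sketch below applies that rule; the article does not show its own computation, so the rule is named here as an assumption, and the function names are illustrative.

```python
import math


def pe_of_average(pe_single: float, n_ratings: int) -> float:
    """Probable error of the mean of `n_ratings` independent ratings."""
    if n_ratings < 1:
        raise ValueError("at least one rating is required")
    return pe_single / math.sqrt(n_ratings)


def ratings_needed(pe_single: float, target_pe: float) -> int:
    """Smallest number of independent ratings whose average reaches `target_pe`."""
    if pe_single <= 0 or target_pe <= 0:
        raise ValueError("probable errors must be positive")
    return max(1, math.ceil((pe_single / target_pe) ** 2))


if __name__ == "__main__":
    for pe in (9, 10):
        print(f"single rating PE {pe}: "
              f"three ratings give {pe_of_average(pe, 3):.1f}")
```

With a single-rating probable error of 9 or 10 points, three independent ratings give about 5.2 and 5.8 points respectively, which matches the "between 5 and 6 points" stated in the text.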

It is clear that no single rating on teachers should be used as a 
measure of that teacher's efficiency. Conditions should be found 
by which at least two administrative officers can rate each teacher. 
If not, the final rating on a teacher should certainly be the average of 
several independent ratings of the same officer. 



